Quantized Stationary Control Policies in Markov Decision Processes

Authors

  • Naci Saldi
  • Tamás Linder
  • Serdar Yüksel
Abstract

For a large class of Markov Decision Processes, stationary (possibly randomized) policies are globally optimal. However, in Borel state and action spaces, the computation and implementation of even such stationary policies are known to be prohibitive. In addition, networked control applications require remote controllers to transmit action commands to an actuator at a low information rate. These two problems motivate the study of approximating optimal policies by quantized (discretized) policies. To this end, we introduce deterministic stationary quantizer policies and show that such policies can approximate optimal deterministic stationary policies with arbitrary precision under mild technical conditions, thus demonstrating that one can search for ε-optimal policies within the class of quantized control policies. We also derive explicit bounds on the approximation error in terms of the rate of the approximating quantizers. We extend all these approximation results to randomized policies. These findings pave the way toward applications in the optimal design of networked control systems where controller actions need to be quantized, as well as toward new computational methods for generating approximately optimal decision policies in general (Polish) state and action spaces for both discounted cost and average cost.
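One way to picture a quantized deterministic stationary policy is the toy sketch below. It is only an illustration under strong assumptions, not the authors' construction: the action space is taken to be the interval [-1, 1], the quantizer is a uniform nearest-neighbor grid, and the names `make_action_grid`, `quantize_policy`, and `policy` are hypothetical. The paper itself works in general Borel (Polish) spaces.

```python
import math

def make_action_grid(low, high, levels):
    """Return `levels` evenly spaced actions in [low, high] (needs levels >= 2)."""
    step = (high - low) / (levels - 1)
    return [low + i * step for i in range(levels)]

def quantize_policy(policy, grid):
    """Wrap a deterministic stationary policy so it only emits grid actions."""
    def quantized(state):
        action = policy(state)
        # Nearest-neighbor quantization of the action onto the finite grid.
        return min(grid, key=lambda a: abs(a - action))
    return quantized

# Hypothetical policy for illustration: saturated linear feedback on a scalar state.
def policy(state):
    return max(-1.0, min(1.0, -0.5 * state))

grid = make_action_grid(-1.0, 1.0, 5)
q_policy = quantize_policy(policy, grid)

# Bits needed to send one grid index to the actuator (assumed fixed-length coding).
rate_bits = math.log2(len(grid))
```

In this toy setting the quantized action differs from the original action by at most half the grid spacing, so a finer grid (a higher rate) tracks the original policy more closely. How such an action error translates into a cost error depends on the conditions established in the paper and is not shown here.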

Related articles

Near Optimality of Quantized Policies in Stochastic Control Under Weak Continuity Conditions

This paper studies the approximation of optimal control policies by quantized (discretized) policies for a very general class of Markov decision processes (MDPs). The problem is motivated by applications in networked control systems, computational methods for MDPs, and learning algorithms for MDPs. We consider the finite-action approximation of stationary policies for a discrete-time Markov dec...

Existence of Optimal Policies for Semi-Markov Decision Processes Using Duality for Infinite Linear Programming

Semi-Markov decision processes on Borel spaces with deterministic kernels have many practical applications, particularly in inventory theory. Most of the results from general semi-Markov decision processes do not carry over to a deterministic kernel since such a kernel does not provide “smoothness.” We develop infinite dimensional linear programming theory for a general stochastic semi-Markov d...

On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes

We consider infinite-horizon stationary γ-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. Using Value and Policy Iteration with some error ε at each iteration, it is well-known that one can compute stationary policies that are 2γ/(1−γ)² ε-optimal. After arguing that this guarantee is tight, we develop variations of Value and Policy Iter...

Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes

It is well known that any finite state Markov decision process (MDP) has a deterministic memoryless policy that maximizes the discounted long-term expected reward. Hence for such MDPs the optimal control problem can be solved over the set of memoryless deterministic policies. In the case of partially observable Markov decision processes (POMDPs), where there is uncertainty about the world state,...

Continuous time Markov decision processes

In this paper, we consider denumerable state continuous time Markov decision processes with (possibly unbounded) transition and cost rates under average criterion. We present a set of conditions and prove the existence of both average cost optimal stationary policies and a solution of the average optimality equation under the conditions. The results in this paper are applied to an admission con...

Journal:
  • CoRR

Volume abs/1310.5770

Publication date: 2013